Fix health check failures when Docker proxy is configured#103
Merged
Conversation
…N unbound variable (#79) Two bugs fixed: 1. Health check curl commands inside containers did not bypass the Docker daemon proxy (http_proxy/https_proxy). When users have a system-wide Docker proxy configured, curl routes internal Docker network requests (e.g. http://elasticsearch:9200) through the proxy, causing the health check to fail even though Elasticsearch is running correctly. Adding --noproxy '*' ensures health checks always connect directly within the Docker network. Affected: ES health check, kibana_settings password setup curl, and Kibana health check. 2. On rolling-release distributions like Arch Linux, /etc/os-release does not define the VERSION variable (only VERSION_ID). Because the script runs with set -eu, referencing $VERSION caused an unbound variable error that crashed the error-log generation, hiding the real failure message. Fixed by using ${VERSION:-} as a safe default. Fixes #79 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
❌ Author of the following commits did not sign a Contributor Agreement: Please, read and sign the above mentioned agreement if you want to contribute to this project |
- tests/get_os_info_test.sh: unit test verifying ${VERSION:-} does not crash
under set -eu when VERSION is absent from /etc/os-release (Arch Linux).
Includes a negative test confirming bare $VERSION would have crashed.
- tests/docker/proxy.sh: integration test verifying ES and Kibana health checks
succeed when Docker client proxy config injects HTTP_PROXY into containers.
Configures ~/.docker/config.json to point at a non-existent proxy
(127.0.0.1:19999), then runs start-local.sh and asserts both services
return HTTP 200.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The kibana_settings Docker Compose command uses bash -c '...' with single quotes. Using --noproxy '*' inside this context breaks the outer single-quoting (the ' in '*' closes the bash -c argument). Use --noproxy "*" (double quotes) instead, which is safe inside a single-quoted context and passes the literal * to curl. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ezimuel
approved these changes
Mar 9, 2026
Collaborator
ezimuel
left a comment
There was a problem hiding this comment.
LGTM, thanks @amitkanfer
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #79
Three bugs fixed, all affecting the same user flow.
Bug 1 — Health check fails when Docker proxy is configured
When users have
http_proxy/https_proxyset in the Docker daemon config (e.g./etc/systemd/system/docker.service.d/or~/.config/docker/config.json), Docker injects those proxy env vars into every container. The health checkcurlcommands then attempt to route internal Docker network requests (e.g.http://elasticsearch:9200,http://kibana:5601) through the proxy, which either can't reach those hostnames or rejects them — causing the health check to time out after all retries (~584s) even though Elasticsearch is running fine (as confirmed by the reporter: manualcurl http://localhost:9200from the host returns 200).Fix: Add
--noproxy '*'to allcurlcommands that contact internal Docker network hostnames:Bug 1b —
kibana_settingspassword setup silently broken insidebash -c '...'The
kibana_settingsservice command isbash -c '...'(outer single quotes). Adding--noproxy '*'inside that context breaks the shell quoting: the'in'*'closes the outer single-quoted string, causing bash to receive a truncated script and exit immediately with an error — so the Kibana system password is never set and the stack fails to start.Fix: Use
--noproxy "*"(double quotes) for thekibana_settingscurl command. Inside a single-quotedbash -c '...'string, double quotes are literal characters passed through to bash, which then interprets"*"as a double-quoted glob-safe*— the correct value for curl's--noproxyargument.Bug 2 —
VERSION: unbound variablecrash in error log generationThe script uses
set -eu. When a failure occurs,generate_error_log()callsget_os_info(), which sources/etc/os-releaseand then prints$VERSION. On rolling-release distributions like Arch Linux,/etc/os-releasedoes not defineVERSION(onlyVERSION_ID). Withset -eu, this causes an unbound variable crash — so the error log is never written and the user sees a confusingVERSION: unbound variablemessage instead of the actual failure details.Fix: Use
${VERSION:-}to safely default to an empty string whenVERSIONis not defined.Tests added
tests/get_os_info_test.sh(runs in all CI jobs)test_get_os_info_succeeds_when_VERSION_is_not_defined: creates a mock/etc/os-releasewithout aVERSIONfield (simulating Arch Linux) and verifies the fixed code path (${VERSION:-}) exits 0 underset -eutest_bare_VERSION_crashes_under_set_eu_when_not_defined: confirms the unfixed pattern ($VERSION) would have crashed, making the above test meaningfultests/docker/proxy.sh(runs in Docker CI jobs)Configures
~/.docker/config.jsonto inject a non-existent proxy (http://127.0.0.1:19999) into all containers — simulating a user with a corporate proxy in their Docker daemon config. Runsstart-local.shand asserts both services respond:test_elasticsearch_accessible_with_proxy_configured: HTTP 200 fromlocalhost:9200test_kibana_accessible_with_proxy_configured: HTTP 200 fromlocalhost:5601All 4 CI jobs (ubuntu-22.04 Docker, ubuntu-24.04 Docker, ubuntu-22.04 Podman, ubuntu-24.04 Podman) pass.
Test plan
http_proxyset in Docker daemon config — health checks pass (verified in CI viatests/docker/proxy.sh)tests/get_os_info_test.sh)🤖 Generated with Claude Code